Audio Source Separation Using a Deep Autoencoder

نویسندگان

  • Gil-Jin Jang
  • Han-Gyu Kim
  • Yung-Hwan Oh
چکیده

This paper proposes a novel framework for unsupervised audio source separation using a deep autoencoder. The characteristics of unknown source signals mixed in the mixed input is automatically by properly configured autoencoders implemented by a network with many layers, and separated by clustering the coefficient vectors in the code layer. By investigating the weight vectors to the final target, representation layer, the primitive components of the audio signals in the frequency domain are observed. By clustering the activation coefficients in the code layer, the previously unknown source signals are segregated. The original source sounds are then separated and reconstructed by using code vectors which belong to different clusters. The restored sounds are not perfect but yield promising results for the possibility in the success of many practical applications.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Adaptive Denoising Autoencoders: A Fine-Tuning Scheme to Learn from Test Mixtures

This work aims at a test-time fine-tune scheme to further improve the performance of an already-trained Denoising AutoEncoder (DAE) in the context of semi-supervised audio source separation. Although the state-of-the-art deep learning-based DAEs show sensible denoising performance when the nature of artifacts is known in advance, the scalability of an already-trained network to an unseen signal...

متن کامل

Raw Multi-Channel Audio Source Separation using Multi-Resolution Convolutional Auto-Encoders

Supervised multi-channel audio source separation requires extracting useful spectral, temporal, and spatial features from the mixed signals. The success of many existing systems is therefore largely dependent on the choice of features used for training. In this work, we introduce a novel multi-channel, multiresolution convolutional auto-encoder neural network that works on raw time-domain signa...

متن کامل

Deep Transform: Cocktail Party Source Separation via Probabilistic Re-Synthesis

In cocktail party listening scenarios, the human brain is able to separate competing speech signals. However, the signal processing implemented by the brain to perform cocktail party listening is not well understood. Here, we trained two separate convolutive autoencoder deep neural networks (DNN) to separate monaural and binaural mixtures of two concurrent speech streams. We then used these DNN...

متن کامل

Adaptive Front-ends for End-to-end Source Separation

Source separation and other audio applications have traditionally relied on the use of short-time Fourier transforms as a front-end frequency domain representation step. We present an auto-encoder neural network that can act as an equivalent to short-time front-end transforms. We demonstrate the ability of the network to learn optimal, real-valued basis functions directly from the raw waveform ...

متن کامل

A joint separation-classification model for sound event detection of weakly labelled data

Source separation (SS) aims to separate individual sources from an audio recording. Sound event detection (SED) aims to detect sound events from an audio recording. We propose a joint separation-classification (JSC) model trained only on weakly labelled audio data, that is, only the tags of an audio recording are known but the time of the events are unknown. First, we propose a separation mappi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1412.7193  شماره 

صفحات  -

تاریخ انتشار 2014